The simulation of fluid flow processes using vector processors
In this thesis the potential gains from the vectorisation of linear and non-linear systems of equations are investigated. Previous studies of the suitability of algorithms for vectorisation have been based on the solution of Poisson's equation. In accordance with this, a range of algorithms is explored and compared using a VA-1 pipeline processor attached to a MASSCOMP MC5400. Analysis shows that almost full vectorisation is possible, leading to speed-up factors of up to 90. On the basis of these results, the vectorised conjugate gradient algorithm with a Jacobi preconditioner (JCGV) is the best of the algorithms considered.
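For reference, the JCGV combination named above, conjugate gradients with diagonal (Jacobi) preconditioning, can be sketched in a few lines. The following is a minimal serial NumPy illustration of the general method, not the thesis's vectorised implementation; the test matrix, tolerance and iteration cap are illustrative assumptions.

```python
import numpy as np

def jacobi_pcg(A, b, tol=1e-8, max_iter=500):
    """Conjugate gradients with M = diag(A) as the preconditioner."""
    d = np.diag(A)                      # Jacobi preconditioner: solve Mz = r as z = r / d
    x = np.zeros_like(b)
    r = b - A @ x
    z = r / d
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            break
        z = r / d
        rz_new = r @ z
        p = z + (rz_new / rz) * p       # new search direction
        rz = rz_new
    return x

# Illustrative SPD test system (1D Poisson-type matrix)
n = 100
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = jacobi_pcg(A, b)
print("residual norm:", np.linalg.norm(A @ x - b))
```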
This work is extended to the development of a two-dimensional fluid flow code for solving the Navier-Stokes equations; the SIMPLE algorithm is implemented to handle the non-linear nature of the equations. The first two problems are isothermal flows, viz. the 'moving lid cavity' and the 'sudden expansion in a duct' problems. A study of where the greatest computational effort is expended, and subsequent vectorisation, leads to 98% of SIMPLE being modified. This results in speed-up factors of 6 for the cavity problem and 29 for the sudden expansion problem. In both problems the JCGV is marginally faster than the vectorised Jacobi with under-relaxation (JURV). However, the JCGV algorithm is not robust, and the approximation must be relaxed carefully; otherwise high computation times or divergence are likely.
Two further problems of increasing complexity are considered; these introduce the scalar quantity of temperature and the characteristics of k-ε turbulence. One problem is based on 'turbulent L-shaped flow in a duct' and the other on 'natural convection in a square cavity'. As a consequence of the higher proportion of scalar computation, the speed-up factors are 5 for the turbulent L-shaped flow and 11 for the natural convection problem. There is little to choose between the JCGV and JURV algorithms; however, the robustness problems with the JCGV algorithm remain.
A multigrid method (ACM) is used to improve the convergence rate of the algorithms, particularly as the size of the problem is increased. Although it is more effective in scalar mode, it also provides worthwhile improvements for the vectorised algorithms, with overall factors of 8.5. Convergence difficulties with the JCG algorithm also prevent its combination with the ACM method. Therefore, the vectorised JUR algorithm with the ACM method is not only more efficient and reliable, but also has scope for improvement as the grid size is increased.
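For illustration, the coarse-grid correction idea that multigrid methods exploit can be shown on a one-dimensional Poisson model problem. The sketch below is a generic two-grid cycle with a damped-Jacobi smoother, full-weighting restriction and linear-interpolation prolongation; it is assumed for exposition only and is not the ACM scheme used in the thesis.

```python
import numpy as np

def poisson_matrix(n, h):
    """Tridiagonal operator for -u'' on n interior points, spacing h."""
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def weighted_jacobi(A, b, x, sweeps=3, omega=2.0 / 3.0):
    """A few damped-Jacobi sweeps: the smoother."""
    d = np.diag(A)
    for _ in range(sweeps):
        x = x + omega * (b - A @ x) / d
    return x

def two_grid_cycle(A, b, x, h):
    """One fine/coarse correction cycle for the 1D model problem."""
    x = weighted_jacobi(A, b, x)                             # pre-smooth
    r = b - A @ x                                            # fine-grid residual
    rc = 0.25 * (r[:-2:2] + 2.0 * r[1:-1:2] + r[2::2])       # full-weighting restriction
    ec = np.linalg.solve(poisson_matrix(len(rc), 2 * h), rc) # exact coarse solve
    e = np.zeros_like(x)                                     # prolong the correction
    e[1:-1:2] = ec                                           # coarse points coincide here
    pad = np.concatenate(([0.0], ec, [0.0]))                 # zero Dirichlet boundaries
    e[::2] = 0.5 * (pad[:-1] + pad[1:])                      # linear interpolation
    return weighted_jacobi(A, b, x + e)                      # correct and post-smooth

n, h = 63, 1.0 / 64                                          # 2**6 - 1 interior points
A, b, x = poisson_matrix(n, h), np.ones(63), np.zeros(63)
for _ in range(10):
    x = two_grid_cycle(A, b, x, h)
print("residual norm:", np.linalg.norm(b - A @ x))
```

Recursing on the coarse solve (rather than solving it directly) would turn this two-grid cycle into a full multigrid V-cycle.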
The potential gains from vectorisation of the SIMPLE family on pipeline architectures have been clearly demonstrated and indicate that such efforts on practical CFD codes should be well rewarded in terms of processor performance.
Intelligent and predictive vehicular networks
Seeking the shortest travel times through smart algorithms may not only optimize travel times but also reduce carbon emissions such as CO2, CO and hydrocarbons. It can also reduce driver frustration and give passengers more consistent travel times, which in turn benefits the overall planning of day schedules. Fuel consumption savings are a further benefit. However, attempts to select the shortest path on the assumption that it gives the quickest travel time often work counter to the very objective intended and risk creating a "Braess paradox", in which congestion results when several drivers attempt to take the same shortest route; the situation that arises has been referred to as the price of anarchy. We propose algorithms that find multiple shortest paths between an origin and a destination. These paths are selected not by the exact number of kilometres travelled but by favourable weights in terms of travel times, so that a reasonable allowable time difference between the multiple shortest paths is attained for the same origin and destination, and favourable responsive routes are determined as functions of traffic levels and time of day. The routes are selected on the paradigm of route balancing, re-routing algorithms and traffic-light intelligence coming together to produce optimized, consistent travel times whose benefits are evenly spread across all motorists, unlike the Entropy-balanced k Shortest Paths (EBkSP) method, which favours some motorists on the basis of urgency. This paper proposes a Fully Balanced Multiple-Candidate shortest path (FBMkP) algorithm, modelled in SUMO, to overcome the computational overhead of assigning priority differently to each travelling vehicle, using intelligence at intersections and other points on the vehicular network. The FBMkP opens up traffic by fully balancing the whole network so as to benefit every motorist. Whereas the EBkSP reserves some routes for high-priority vehicles, our algorithm distributes the benefits of smart routing to all vehicles on the network and saves roadside units such as induction loops and detectors from having to remember the urgency of each vehicle. Instead, detectors and induction loops simply poll the destination of the vehicle, not an urgency factor. The minimal data being processed significantly reduces computation times and benefits all vehicles. The multiple candidate shortest paths, selected on the basis of the current traffic status of each possible route, increase efficiency. There are fewer routes than vehicles, so processing route weights is smarter than processing individual vehicle weights. This is a multi-objective project in which improving one factor, such as travel times, improves many more cost, social and environmental factors.
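As a rough illustration of the multiple-candidate idea, the sketch below uses networkx to enumerate the k lowest-travel-time simple paths between an origin and a destination, then spreads vehicles over them round-robin, polling only each vehicle's destination and no urgency factor. The toy network, edge weights and round-robin assignment are assumptions for exposition, not the FBMkP algorithm itself.

```python
import itertools
import networkx as nx

def candidate_routes(G, origin, dest, k=3, weight="travel_time"):
    """Up to k simple paths from origin to dest, shortest travel time first."""
    paths = nx.shortest_simple_paths(G, origin, dest, weight=weight)
    return list(itertools.islice(paths, k))

def assign_round_robin(vehicles, routes):
    """Spread vehicles evenly over the candidate routes (no per-vehicle urgency)."""
    return {v: routes[i % len(routes)] for i, v in enumerate(vehicles)}

# Toy road network: edge weights are current travel times in seconds
G = nx.DiGraph()
G.add_weighted_edges_from(
    [("A", "B", 60), ("B", "D", 60), ("A", "C", 70), ("C", "D", 65), ("A", "D", 200)],
    weight="travel_time",
)
routes = candidate_routes(G, "A", "D", k=2)
print(assign_round_robin(["car1", "car2", "car3", "car4"], routes))
```

Recomputing the edge weights from live traffic levels before each assignment would give the time-of-day responsiveness described above.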
Towards web usage attribution via graph community detection in grouped internet connection records
Internet connection records can be very useful to digital forensic analysts in producing Internet history timelines and making deductions about the cause and effect of activity. However, the available data may include only a subset of the data that would be available from physical extraction. For example, the new UK legislation allows the collection of host website details, time of access and subscriber details, but not the specific uniform resource locator visited. Here, we investigate how to process data from Internet connection records to extract the websites, and to construct the sessions of activity that are likely to be idiosyncratic to the individual users, from the set of multiple possible users. We demonstrate how to display Internet history sessions as a network and perform graph community detection, showing a scheme for breaking up the component parts of the Internet history sessions into groups. We also introduce the use of websites’ relative popularity for identifying websites that are likely to be meaningful to particular users of particular devices, further improving the accuracy of attributing a particular activity session to a particular user at a particular point in time.
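To make the community-detection step concrete, the sketch below builds a small graph whose nodes are websites and whose edges link sites visited within the same session, then partitions it by greedy modularity maximisation. The graph, the site names and the choice of greedy modularity are illustrative assumptions rather than the paper's exact pipeline.

```python
import networkx as nx
from networkx.algorithms import community

# Toy session graph: edges link websites co-visited within a session
G = nx.Graph()
G.add_edges_from([
    ("news-site", "sports-site"), ("news-site", "weather-site"),
    ("sports-site", "weather-site"),            # one cluster of habits
    ("dev-forum", "code-host"), ("dev-forum", "doc-site"),
    ("code-host", "doc-site"),                  # a second cluster
    ("weather-site", "doc-site"),               # weak cross-link between them
])

# Greedy modularity maximisation groups densely connected websites together
for group in community.greedy_modularity_communities(G):
    print(sorted(group))
```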
Facilitating forensic examinations of multi-user computer environments through session-to-session analysis of internet history
This paper proposes a new approach to the forensic investigation of Internet history artefacts by aggregating the history from a recovered device into sessions and comparing those sessions with one another to determine whether they are one-time events or form a repetitive or habitual pattern. We describe two approaches for performing the session aggregation: fixed-length sessions and variable-length sessions. We also describe an approach for identifying repetitive pattern-of-life behaviour and show how such patterns can be extracted and represented as binary strings. Using the Jaccard similarity coefficient, a session-to-session comparison can be performed and the sessions analysed to determine to what extent a particular session is similar to any other session in the Internet history, and thus is highly likely to correspond to the same user. Experiments have been conducted using two sets of test data in which multiple users have access to the same computer. By identifying patterns of Internet usage that are unique to each user, our approach exhibits a high success rate in attributing particular sessions of the Internet history to the correct user. This can provide considerable help to a forensic investigator trying to establish which user was using the computer when a web-related crime was committed.
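The session-to-session comparison described above is straightforward to illustrate: with each session represented as a binary string whose bits flag, say, which websites were visited, the Jaccard coefficient is the number of bits set in both strings divided by the number set in either. The session strings below are hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of two equal-length binary activity strings."""
    both = sum(1 for x, y in zip(a, b) if x == "1" and y == "1")    # intersection
    either = sum(1 for x, y in zip(a, b) if x == "1" or y == "1")   # union
    return both / either if either else 0.0

# Hypothetical sessions: each bit flags whether a given website was visited
session_a = "1101000010"
session_b = "1001000110"
session_c = "0010111001"
print(jaccard(session_a, session_b))  # high similarity: plausibly the same user
print(jaccard(session_a, session_c))  # low similarity: plausibly a different user
```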
An extension of the standard Omega test for the parallelisation of computational mechanics codes containing arrays with mapped indices
Computer aided parallelisation of unstructured mesh codes
Abstract not available.
Parallel bandwidth characteristics calculations for thin avalanche photodiodes on a SGI Origin 2000 supercomputer
An important factor for high-speed optical communication is the availability of ultrafast, low-noise photodetectors. Among the semiconductor photodetectors that are commonly used in today’s long-haul and metro-area fiber-optic systems, avalanche photodiodes (APDs) are often preferred over p-i-n photodiodes due to their internal gain, which significantly improves the receiver sensitivity and alleviates the need for optical pre-amplification. Unfortunately, the random nature of the carrier impact-ionization process that generates the gain is inherently noisy and results in fluctuations not only in the gain but also in the time response. We have recently developed a theory characterizing the autocorrelation function of APDs which incorporates the dead-space effect, an effect that is very significant in thin, high-performance APDs. This research extends the time-domain analysis of the dead-space multiplication model to compute the autocorrelation function of the APD impulse response. However, the computation requires a large amount of memory and is very time consuming. Here we describe our experiences in parallelizing the code in MPI and OpenMP using CAPTools. Several array partitioning schemes and scheduling policies are implemented and tested. Our results show that the code is scalable up to 64 processors on an SGI Origin 2000 machine and has small average errors.
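By way of illustration, one simple array partitioning scheme of the kind tested is a contiguous block decomposition, where each process owns a slice of the global array and partial results are combined by a reduction. The sketch below, using mpi4py, is a generic MPI pattern assumed for exposition, not the CAPTools-generated code.

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n = 1_000_000                                   # global problem size
# Block partitioning: spread any remainder over the first n % size ranks
counts = [n // size + (1 if r < n % size else 0) for r in range(size)]
start = sum(counts[:rank])                      # this rank's offset into the array
local = np.arange(start, start + counts[rank], dtype=np.float64)

local_sum = local.sum()                         # each rank works on its block only
total = comm.reduce(local_sum, op=MPI.SUM, root=0)
if rank == 0:
    print("global sum:", total)                 # combined on the root process
```

A script like this would be launched under an MPI runner, e.g. mpiexec -n 4 python sum.py.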
Parallelisation and performance evaluation of the aeronautical CFD flow code ESAUNA
The parallelisation of an industrially important in-house CFD code for calculating the airflow over complex aircraft configurations using the Euler or Navier-Stokes equations is described. The code uses a novel grid system which may include block-structured hexahedral grids, unstructured tetrahedral grids or a hybrid combination of both. Some details of the parallelisation approach are discussed, and performance results are presented for industrial-scale test cases running on three parallel platforms: the Cray T3D, IBM SP2 and Parsytec GC/PP. Some lessons learned during the project are briefly noted.
A multi-objective evolutionary algorithm for portfolio optimisation
The use of heuristic evolutionary algorithms to address the problem of portfolio optimisation has been well documented. In order to decide which assets to invest in and how much to invest, one needs to assess the potential risk and return of different portfolios. This problem is well suited to a Multi-Objective Evolutionary Algorithm (MOEA) that maximises return and minimises risk. We are working on a new MOEA loosely based on Zitzler's Strength Pareto Evolutionary Algorithm (SPEA2) [20], using Value at Risk (VaR) as the risk constraint. This algorithm currently uses a dynamic population in order to overcome the problem of losing solutions. We are also investigating a dynamic diversity and density operator.
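For context, the two objectives can be sketched directly: historical Value at Risk as the risk measure and mean return as the reward, with a naive Pareto filter standing in for the evolutionary selection. The synthetic return data, the random portfolio population and the simple dominance check below are assumptions for exposition, not the SPEA2-based algorithm described.

```python
import numpy as np

def historical_var(returns, weights, alpha=0.95):
    """Historical Value at Risk: the alpha-level loss quantile of the portfolio."""
    portfolio = returns @ weights                     # per-scenario portfolio returns
    return -np.percentile(portfolio, 100 * (1 - alpha))

def non_dominated(points):
    """Indices of (return, risk) pairs not dominated by any other pair
    (higher return AND lower risk, strict in at least one, dominates)."""
    keep = []
    for i, (ri, vi) in enumerate(points):
        dominated = any(rj >= ri and vj <= vi and (rj > ri or vj < vi)
                        for j, (rj, vj) in enumerate(points) if j != i)
        if not dominated:
            keep.append(i)
    return keep

rng = np.random.default_rng(0)
returns = rng.normal(0.0005, 0.01, size=(250, 4))     # 250 days, 4 synthetic assets
pop = rng.dirichlet(np.ones(4), size=50)              # 50 random weight vectors
points = [(returns.mean(0) @ w, historical_var(returns, w)) for w in pop]
print(len(non_dominated(points)), "portfolios on the Pareto front")
```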